5 results for 080109 Pattern Recognition and Data Mining

at Duke University


Relevance:

100.00%

Abstract:

BACKGROUND: The inherent complexity of statistical methods and clinical phenomena compels researchers with diverse domains of expertise to work in interdisciplinary teams, where none of them has complete knowledge of their counterparts' fields. As a result, knowledge exchange may often be characterized by miscommunication leading to misinterpretation, ultimately resulting in errors in research and even clinical practice. Although communication has a central role in interdisciplinary collaboration and miscommunication can have a negative impact on research processes, to the best of our knowledge no study has yet explored how data analysis specialists and clinical researchers communicate over time. METHODS/PRINCIPAL FINDINGS: We conducted a qualitative analysis of encounters between clinical researchers and data analysis specialists (epidemiologist, clinical epidemiologist, and data mining specialist). These encounters were recorded and systematically analyzed using a grounded theory methodology for extraction of emerging themes, followed by data triangulation and analysis of negative cases for validation. A policy analysis was then performed using a system dynamics methodology looking for potential interventions to improve this process. Four major emerging themes were found. Definitions using lay language were frequently employed as a way to bridge the language gap between the specialties. Thought experiments presented a series of "what if" situations that helped clarify how the method or information from the other field would behave if exposed to alternative situations, ultimately aiding in explaining their main objective. Metaphors and analogies were used to translate concepts across fields, from the unfamiliar to the familiar. Prolepsis was used to anticipate study outcomes, thus helping specialists understand the current context based on an understanding of their final goal.
CONCLUSION/SIGNIFICANCE: The communication between clinical researchers and data analysis specialists presents multiple challenges that can lead to errors.

Relevance:

100.00%

Abstract:

An enterprise information system (EIS) is an integrated data-applications platform characterized by diverse, heterogeneous, and distributed data sources. For many enterprises, a number of business processes still depend heavily on static rule-based methods and extensive human expertise. Enterprises are faced with the need for optimizing operation scheduling, improving resource utilization, discovering useful knowledge, and making data-driven decisions.

This thesis research is focused on real-time optimization and knowledge discovery that addresses workflow optimization, resource allocation, as well as data-driven predictions of process-execution times, order fulfillment, and enterprise service-level performance. In contrast to prior work on data analytics techniques for enterprise performance optimization, the emphasis here is on realizing scalable and real-time enterprise intelligence based on a combination of heterogeneous system simulation, combinatorial optimization, machine-learning algorithms, and statistical methods.

On-demand digital-print service is a representative enterprise requiring a powerful EIS. We use real-life data from Reischling Press, Inc. (RPI), a digital-print-service provider (PSP), to evaluate our optimization algorithms.

In order to handle the increase in volume and diversity of demands, we first present a high-performance, scalable, and real-time production scheduling algorithm for production automation based on an incremental genetic algorithm (IGA). The objective of this algorithm is to optimize the order dispatching sequence and balance resource utilization. Compared to prior work, this solution is scalable for a high volume of orders and it provides fast scheduling solutions for orders that require complex fulfillment procedures. Experimental results highlight its potential benefit in reducing production inefficiencies and enhancing the productivity of an enterprise.
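To make the idea of optimizing a dispatching sequence with a genetic algorithm concrete, the following is a minimal sketch and not the IGA described in the thesis. It assumes a single production line, uses total tardiness as the only objective, and omits both the incremental handling of newly arriving orders and the resource-utilization balancing. All names (`total_tardiness`, `order_crossover`, `swap_mutation`, `schedule_orders`) and parameter values are hypothetical.

```python
import random


def total_tardiness(sequence, durations, due_dates):
    """Sum of lateness over all orders processed one after another."""
    clock, tardiness = 0, 0
    for order in sequence:
        clock += durations[order]
        tardiness += max(0, clock - due_dates[order])
    return tardiness


def order_crossover(parent_a, parent_b, rng):
    """Keep a slice of parent_a in place; fill the rest in parent_b's order."""
    n = len(parent_a)
    lo, hi = sorted(rng.sample(range(n + 1), 2))
    middle = parent_a[lo:hi]
    taken = set(middle)
    rest = [gene for gene in parent_b if gene not in taken]
    return rest[:lo] + middle + rest[lo:]


def swap_mutation(sequence, rng):
    """Return a copy of the sequence with two positions exchanged."""
    i, j = rng.sample(range(len(sequence)), 2)
    mutated = list(sequence)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


def schedule_orders(durations, due_dates, generations=200,
                    population_size=30, seed=0):
    """Search for a dispatching sequence with low total tardiness."""
    rng = random.Random(seed)  # seeded for reproducible runs
    n = len(durations)

    def cost(seq):
        return total_tardiness(seq, durations, due_dates)

    # Start from the arrival order plus random permutations.
    population = [list(range(n))]
    while len(population) < population_size:
        population.append(rng.sample(range(n), n))

    for _ in range(generations):
        population.sort(key=cost)
        survivors = population[: population_size // 2]  # elitist selection
        children = []
        while len(survivors) + len(children) < population_size:
            a, b = rng.sample(survivors, 2)
            children.append(swap_mutation(order_crossover(a, b, rng), rng))
        population = survivors + children

    return min(population, key=cost)
```

Because the arrival order is part of the initial population and the better half always survives, the returned sequence is never worse than dispatching orders as they arrived, under this sketch's cost function.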

We next discuss analysis and prediction of different attributes involved in hierarchical components of an enterprise. We start from a study of the fundamental processes related to real-time prediction. Our process-execution time and process status prediction models integrate statistical methods with machine-learning algorithms. In addition to improving prediction accuracy compared to stand-alone machine-learning algorithms, they also provide a probabilistic estimate of the predicted status. An order generally consists of multiple series and parallel processes. We next introduce an order-fulfillment prediction model that combines the advantages of multiple classification models by incorporating flexible decision-integration mechanisms. Experimental results show that adopting due dates recommended by the model can significantly reduce the enterprise late-delivery ratio. Finally, we investigate service-level attributes that reflect the overall performance of an enterprise. We analyze and decompose time-series data into different components according to their hierarchical periodic nature, perform correlation analysis, and develop univariate prediction models for each component as well as multivariate models for correlated components. Predictions for the original time series are aggregated from the predictions of its components. In addition to a significant increase in mid-term prediction accuracy, this distributed modeling strategy also improves short-term time-series prediction accuracy.
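The decompose-predict-aggregate strategy can be illustrated with a small sketch. It is a simplification under stated assumptions, not the thesis models: a single period stands in for the hierarchical periodic structure, each component gets a deliberately naive univariate predictor, and the correlation analysis and multivariate models are left out. The function names `decompose` and `forecast` are hypothetical.

```python
from statistics import mean


def decompose(series, period):
    """Split a series into a level, a periodic pattern, and a residual."""
    level = mean(series)
    # Average deviation from the level at each position within the period.
    seasonal = [mean(series[i::period]) - level for i in range(period)]
    residual = [x - level - seasonal[t % period]
                for t, x in enumerate(series)]
    return level, seasonal, residual


def forecast(series, period, horizon):
    """Predict each component separately, then sum the predictions."""
    level, seasonal, residual = decompose(series, period)
    # Assumed per-component models: the level stays constant, the periodic
    # pattern repeats, and the residual is predicted by its recent mean.
    residual_forecast = mean(residual[-period:])
    n = len(series)
    return [level + seasonal[(n + h) % period] + residual_forecast
            for h in range(horizon)]
```

For a series that repeats exactly, such as `[1, 2, 3]` four times with `period=3`, the residual is zero and `forecast` reproduces the next cycle of the pattern.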

In summary, this thesis research has led to a set of characterization, optimization, and prediction tools for an EIS to derive insightful knowledge from data and use it as guidance for production management. It is expected to provide solutions for enterprises to increase reconfigurability, accomplish more automated procedures, and obtain data-driven recommendations or effective decisions.

Relevance:

100.00%

Abstract:

BACKGROUND: Like other vertebrates, primates recognize their relatives, primarily to minimize inbreeding, but also to facilitate nepotism. Although associative, social learning is typically credited for discrimination of familiar kin, discrimination of unfamiliar kin remains unexplained. As sex-biased dispersal in long-lived species cannot consistently prevent encounters between unfamiliar kin, inbreeding remains a threat and mechanisms to avoid it beg explanation. Using a molecular approach that combined analyses of biochemical and microsatellite markers in 17 female and 19 male ring-tailed lemurs (Lemur catta), we describe odor-gene covariance to establish the feasibility of olfactory-mediated kin recognition. RESULTS: Despite derivation from different genital glands, labial and scrotal secretions shared about 170 of their respective 338 and 203 semiochemicals. In addition, these semiochemicals encoded information about genetic relatedness within and between the sexes. Although the sexes showed opposite seasonal patterns in signal complexity, the odor profiles of related individuals (whether same-sex or mixed-sex dyads) converged most strongly in the competitive breeding season. Thus, a strong, mutual olfactory signal of genetic relatedness appeared specifically when such information would be crucial for preventing inbreeding. That weaker signals of genetic relatedness might exist year round could provide a mechanism to explain nepotism between unfamiliar kin. CONCLUSION: We suggest that signal convergence between the sexes may reflect strong selective pressures on kin recognition, whereas signal convergence within the sexes may arise as its by-product or function independently to prevent competition between unfamiliar relatives. The link between an individual's genome and its olfactory signals could be mediated by biosynthetic pathways producing polymorphic semiochemicals or by carrier proteins modifying the individual bouquet of olfactory cues. 
In conclusion, we unveil a possible olfactory mechanism of kin recognition that has specific relevance to understanding inbreeding avoidance and nepotistic behavior observed in free-ranging primates, and broader relevance to understanding the mechanisms of vertebrate olfactory communication.

Relevance:

100.00%

Abstract:

Confronting the rapidly increasing, worldwide reliance on biometric technologies to surveil, manage, and police human beings, my dissertation Informatic Opacity: Biometric Facial Recognition and the Aesthetics and Politics of Defacement charts a series of queer, feminist, and anti-racist concepts and artworks that favor opacity as a means of political struggle against surveillance and capture technologies in the 21st century. Utilizing biometric facial recognition as a paradigmatic example, I argue that today's surveillance requires persons to be informatically visible in order to control them, and such visibility relies upon the production of technical standardizations of identification to operate globally, which most vehemently impact non-normative, minoritarian populations. Thus, as biometric technologies turn exposures of the face into sites of governance, activists and artists strive to make the face biometrically illegible and refuse the political recognition biometrics promises through acts of masking, escape, and imperceptibility. Although I specifically describe tactics of making the face unrecognizable as "defacement," I broadly theorize refusals to visually cohere to digital surveillance and capture technologies' gaze as "informatic opacity," an aesthetic-political theory and practice of anti-normativity at a global, technical scale whose goal is maintaining the autonomous determination of alterity and difference by evading the quantification, standardization, and regulation of identity imposed by biometrics and the state. My dissertation also features two artworks: Facial Weaponization Suite, a series of masks and public actions, and Face Cages, a critical, dystopic installation that investigates the abstract violence of biometric facial diagramming and analysis.
I develop an interdisciplinary, practice-based method that pulls from contemporary art and aesthetic theory, media theory and surveillance studies, political and continental philosophy, queer and feminist theory, transgender studies, postcolonial theory, and critical race studies.

Relevance:

100.00%

Abstract:

We recently developed an approach for testing the accuracy of network inference algorithms by applying them to biologically realistic simulations with known network topology. Here, we seek to determine the degree to which the network topology and data sampling regime influence the ability of our Bayesian network inference algorithm, NETWORKINFERENCE, to recover gene regulatory networks. NETWORKINFERENCE performed well at recovering feedback loops and multiple targets of a regulator with small amounts of data, but required more data to recover multiple regulators of a gene. When collecting the same number of data samples at different intervals from the system, the best recovery was produced by sampling intervals long enough that sampling covered propagation of regulation through the network, but not so long that intervals missed internal dynamics. These results further elucidate the possibilities and limitations of network inference based on biological data.